My portfolio: an introduction

What is my corpus?

When the course started, we were asked to choose a corpus. It was important to find something that allowed for meaningful comparisons and contrasts, so that we could answer a specific research question. This gave me an interesting idea: since 2014, I have been keeping track of all the songs I listen to, using a website called Last FM. Every time a track is played on a media player such as Spotify or iTunes, a “scrobble” is recorded. This way, I have scrobbled a total of 121,587 tracks (and counting!). What better corpus to choose than one that contains a large part of all the music you have ever listened to? Although at the start of this course I had never worked with an API, and had only learned intermediate R skills in my third year of Psychology, it sounded like an interesting challenge. So, I started googling.

Very soon I found out that this might become a daunting task. Collecting all my scrobbles from the Last FM API wasn’t the hard part; combining over 100,000 songs with Spotify features, however, was something I was not capable of. Luckily, I found a guide written by Andrew Walker, a researcher from the University of Florida, that included detailed instructions on how to do exactly this. Fetching the features would be the slowest part of the code, he said, likely taking 10-15 minutes. For a dataset as large as mine, that turned out to be a gross underestimation. When I got the code working, I cut up the fetching process into two parts and my dataset into 5 parts, and let it all run sequentially. Six long hours of waiting later, it was finally there: all my scrobbles and their corresponding Spotify features! From this point on I knew that analyzing my corpus could lead to some very interesting results.
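A minimal sketch, in Python rather than R, of what cutting a dataset into parts and fetching them one after another can look like. The names `split_into_parts`, `fetch_in_parts`, and `fake_fetch` are hypothetical; `fake_fetch` only stands in for a real API call and does not contact Spotify.

```python
from typing import Callable, Dict, Iterator, List


def split_into_parts(items: List[str], n_parts: int) -> Iterator[List[str]]:
    """Yield n_parts consecutive slices of items, as equal in size as possible."""
    base, extra = divmod(len(items), n_parts)
    start = 0
    for i in range(n_parts):
        # The first `extra` parts get one additional item each.
        end = start + base + (1 if i < extra else 0)
        yield items[start:end]
        start = end


def fetch_in_parts(
    track_ids: List[str],
    fetch_features: Callable[[List[str]], Dict[str, dict]],
    n_parts: int = 5,
) -> Dict[str, dict]:
    """Fetch features part by part, sequentially, and merge the results."""
    results: Dict[str, dict] = {}
    for part in split_into_parts(track_ids, n_parts):
        results.update(fetch_features(part))
    return results


def fake_fetch(ids: List[str]) -> Dict[str, dict]:
    # Hypothetical stand-in for a real request; returns made-up values.
    return {track_id: {"energy": 0.5} for track_id in ids}


track_ids = [f"track-{i}" for i in range(12)]
part_sizes = [len(part) for part in split_into_parts(track_ids, 5)]
features = fetch_in_parts(track_ids, fake_fetch)
print(part_sizes, len(features))
```

Working in parts like this means a failure halfway through only costs one part, not the whole run.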

In this portfolio, I will try to answer one main research question: How does time influence my music listening? To answer this question, I will look at three different modes of time:

1. Hour of the day
2. Month of the year
3. Year of my life (also known as age, perhaps)
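As a rough illustration, the three modes of time can all be derived from a single play timestamp. The sketch below is in Python and assumes each scrobble carries a Unix timestamp in seconds; the function name `time_modes` is hypothetical. It reads the timestamp as UTC, so for a real hour-of-day analysis the local time zone would still have to be applied.

```python
from datetime import datetime, timezone


def time_modes(unix_seconds: int) -> dict:
    """Split one play timestamp into hour of day, month of year, and year."""
    # Assumption: the timestamp is in seconds and interpreted as UTC.
    played_at = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    return {
        "hour": played_at.hour,
        "month": played_at.month,
        "year": played_at.year,
    }


# 1546300800 is 2019-01-01 00:00:00 UTC.
example = time_modes(1546300800)
print(example)
```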

First, I will provide a visual overview of my data. Here you can find all sorts of interesting descriptive statistics for each year: how much music I’ve listened to, how my Spotify features have developed over time, and more. The most important explanatory variable is, of course, time.

Then, I will conduct more detailed analyses. Certain interesting patterns emerged from my preliminary analyses. How can they be explained? Can I find more information about them in chordograms, keygrams, and self-similarity matrices?

2015

Total song plays for the year 2015

Unique album plays for the year 2015

Unique song plays for the year 2015

In 2015, I was still in high school, and I was listening to a lot of Mac DeMarco. I put on his music and listened to all his albums from start to finish, pretty much on repeat. I have never listened to an artist that much again, which is why the music I listened to in 2014 and 2015 is still at the top of my most played. Since then, I have started listening to a much wider variety of music, which can also be seen from my album plays in the gauges.

2016

Total song plays for the year 2016

Unique album plays for the year 2016

Unique song plays for the year 2016

At the start of 2016, I was in the middle of my gap year. I was spending a lot of time playing guitar and listening to music, but I was still in the early phases of discovery. I took all this into my first year of studying, where I was still listening to a lot of the music I had found the year before.

2017

Total song plays for the year 2017

Unique album plays for the year 2017

Unique song plays for the year 2017

For me, the year 2017 got off to a strange start. I had quit studying philosophy and I was living in Amsterdam, but I had no idea what direction my life was going in. This was the point at which I felt that I needed to take some bigger steps. You can see this very clearly in my music listening: the number of different albums I listened to nearly tripled! It will be interesting to see if we can also find some trends in the Spotify features from this year onward.

2018

Total song plays for the year 2018

Unique album plays for the year 2018

Unique song plays for the year 2018

2018 marked the start of something new. I started studying Psychology, and I was beginning to listen to music on a whole new level. Since I had to spend hours studying in the library, I started listening to different music as well: Boards of Canada was one of my go-to artists for studying, and has slowly become one of my favorite artists.

2019

Total song plays for the year 2019

Unique album plays for the year 2019

Unique song plays for the year 2019

It was 2019, and things started gaining traction. I was discovering more of my soon-to-be favorite artists: I had already been listening to Yo La Tengo and Boards of Canada, and I came across all sorts of different nineties bands I just couldn’t seem to get around. Suddenly I was finding all sorts of electronic music I liked, which can be seen in the genre chart.

All Years

Total song plays from 2015 to 2019

Unique album plays from 2015 to 2019

Unique song plays from 2015 to 2019

Over the years I have listened to a very large amount of music. I know my music listening habits very well. Can we also see them reflected in the data?
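The three numbers shown in the gauges (total plays, unique songs, unique albums) can be counted per year from a list of scrobbles. The Python sketch below uses made-up rows, and the field names `year`, `artist`, `track`, and `album` are assumptions for illustration, not the actual columns of my dataset.

```python
from collections import defaultdict

# Made-up scrobbles for illustration only.
scrobbles = [
    {"year": 2015, "artist": "Artist 1", "track": "Song A", "album": "Album X"},
    {"year": 2015, "artist": "Artist 1", "track": "Song A", "album": "Album X"},
    {"year": 2015, "artist": "Artist 1", "track": "Song B", "album": "Album X"},
    {"year": 2016, "artist": "Artist 2", "track": "Song C", "album": "Album Y"},
    {"year": 2016, "artist": "Artist 3", "track": "Song D", "album": "Album Z"},
]


def yearly_summary(rows):
    """Count total plays, unique songs, and unique albums for each year."""
    plays = defaultdict(int)
    songs = defaultdict(set)
    albums = defaultdict(set)
    for row in rows:
        year = row["year"]
        plays[year] += 1
        # Pair titles with the artist so identical titles by different
        # artists are not merged into one song or album.
        songs[year].add((row["artist"], row["track"]))
        albums[year].add((row["artist"], row["album"]))
    return {
        year: {
            "total_plays": plays[year],
            "unique_songs": len(songs[year]),
            "unique_albums": len(albums[year]),
        }
        for year in sorted(plays)
    }


summary = yearly_summary(scrobbles)
for year, stats in summary.items():
    print(year, stats)
```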

Spotify features

Features over the years


My first plan when I had my data ready was to look at the Spotify features over time, to see if interesting patterns would emerge. The plot on the left shows some of the features plotted against date, from 2014 to 2020. From the data on my front page you could already tell that my music listening has changed. This is also reflected in the features.

The biggest change over time seems to be in instrumentalness. Acousticness seems to be increasing as well, and energy is decreasing. How can this be explained?

Different genres…


My cluster analysis seems to provide a lot of insight. On this page I provide more detail on how I conducted this. Using Last FM tags, which are keywords on Last FM that describe an artist, I managed to cluster the artists into genres quite successfully.

As you can see, my preference for genre has changed quite a bit over time. Where I was listening to a lot of indie/lo-fi in 2015 (looking at you, Mac DeMarco), I am now listening to a wider range of genres, like electronica/ downtempo and ambient/ classical.

What is the relation between the change of genre over time and the change of features over time?

Different features


In this final plot, you can see the relationship between my two main genres and the energy and instrumentalness of the music I listen to. As the energy goes down, the amount of lo-fi/ indie music I listen to goes down. Conversely, the amount of electronica/ downtempo I listen to goes up. Also, as the instrumentalness goes up, electronica/ downtempo goes up, and lo-fi/ indie goes down. As explained previously, the electronica/ downtempo music I listen to has a higher instrumentalness and lower energy compared to lo-fi/ indie. The trend we see in this graph is therefore explained by the fact that I have been listening to a lot more electronic music.

The conclusion I draw from this is that my music taste, but also my preference, has changed over the years. As a student, my life has become busier, a lot more fun, but also a lot more exhausting. The moments I actually sit down to listen to music are the moments I like to use to wind down, and that is usually when I put on electronic music. Boards of Canada is a good example of this, and you can find an example of a song on my 2019 page.

Chordograms and keygrams over time

Histogram of the keys of songs listened to in 2015 and 2019


In my portfolio, I want to see how my music listening has changed over the years. To analyse this, I have a corpus that consists of the songs I have listened to since 2014. One thing that might have changed is the key of the songs that I have listened to. To analyse this, I made a histogram of all the keys of the songs in 2015, my final year of high school, and 2019, when I was halfway through my second year of Psychology.

In the histogram, you can see for every key its proportion of all the songs that I listened to in 2015 and in 2019. It seems that D is the most popular key, and D# the least. There are slight differences in key between the years, but most noticeably, it seems I am listening to far fewer songs in the key of A. Why is this?

To figure this out, I made a table of the artists I listened to in each year that wrote songs in A, and looked at the artists with the highest frequency. Not to my surprise, most songs in A were written by artists like Mac DeMarco, Beach House, Grizzly Bear, and The Black Keys, which are all alternative/ indie artists using guitars. The A chord is popular in songs written on guitar, since it can be played as an open chord, and it goes well with many other open chords. Since 2015, I have started listening to a lot less guitar-centered music, which might explain why I am also listening to fewer songs in the key of A.
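The proportions behind a histogram like this can be computed along the following lines. This is a Python sketch with made-up data. It assumes the keys are stored the way Spotify's audio features report them, as pitch-class integers (0 = C, 1 = C#, and so on, with -1 meaning no key was detected); the function name `key_proportions` is hypothetical.

```python
from collections import Counter

# Pitch-class names in order, so index 0 is C and index 9 is A.
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def key_proportions(keys):
    """Return each key's share of all plays that have a detected key."""
    # Drop plays without a detected key (reported as -1).
    detected = [k for k in keys if 0 <= k <= 11]
    counts = Counter(detected)
    total = len(detected)
    return {PITCH_CLASSES[k]: counts[k] / total for k in sorted(counts)}


# Made-up key values for one year, one entry per play.
keys_one_year = [9, 9, 2, 2, 2, 7, 0, -1]
proportions = key_proportions(keys_one_year)
print(proportions)
```

Running this once per year and putting the results side by side gives the comparison shown in the histogram.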

Tempo mean and standard deviation


For this plot, I used my top 15 songs from 2015 and 2019. Since I had to use the data I fetched from Last FM, I had a hard time getting it into a shape that would work with the compmus package. I managed to get it right, however, and the resulting plot is quite interesting.

Here you can see the mean tempo plotted against the SD of tempo, with colour indicating tempo, size indicating song duration, and opacity indicating loudness. There seem to be quite large differences between 2015 and 2019. First of all, the range of tempo is much larger in 2019: it spans from ~70 to ~160 BPM, while in 2015 the tempo is clustered around 100 BPM. This indicates that by 2019, my music taste had become more varied. This can also be seen in song duration and loudness: in 2015 they seem to be similar across songs, while in 2019 they seem to vary more.

Interestingly, the standard deviation of tempo seems to increase with tempo. Does this mean that higher tempo songs also have a higher deviation in tempo? I have no clue.
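For reference, this is roughly how a mean and standard deviation of tempo can be obtained for one song. The Python sketch assumes a song is described by a list of tempo estimates, one per section, which is an assumption about the input and not necessarily what the compmus package does internally. The function name `tempo_summary` and the tempo values are made up.

```python
from statistics import mean, stdev


def tempo_summary(section_tempos):
    """Return (mean, sample standard deviation) of a song's section tempos."""
    if len(section_tempos) < 2:
        raise ValueError("need at least two sections to compute a deviation")
    return mean(section_tempos), stdev(section_tempos)


# Made-up tempo estimates in BPM for two hypothetical songs.
steady_song = tempo_summary([100.0, 100.0, 100.0])
varied_song = tempo_summary([90.0, 100.0, 110.0])
print(steady_song, varied_song)
```

One possible way to look into the question above would be to divide the SD by the mean for each song, so that fast and slow songs are compared on a relative scale.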

Genre clustering